Linking Entities in Short Texts Based on a Chinese Semantic Knowledge Base

Authors

  • Yi Zeng
  • Dongsheng Wang
  • Tielin Zhang
  • Hao Wang
  • Hongwei Hao
Abstract

Populating an existing knowledge base with new facts is important for keeping it fresh and up to date. Before new knowledge is imported, entity linking is required so that the entities in the new knowledge can be linked to the corresponding entities in the knowledge base. Entity disambiguation is the most challenging task in this process. Many studies have addressed the name ambiguity problem with a variety of algorithms. In this paper, we propose an entity linking method based on a Chinese semantic knowledge base, in which entity disambiguation is addressed by retrieving a variety of semantic relations and analyzing the corresponding documents with similarity measurement. Based on the proposed method, we developed CASIA_EL, a system for linking entities with knowledge bases. We validate the proposed method by linking 1232 entities mined from Sina Weibo to a Chinese semantic knowledge base, resulting in an accuracy of 88.5%. The results show that the CASIA_EL system and the proposed algorithm are potentially effective.
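The abstract does not spell out the similarity measurement, so the following is only a minimal sketch of the general idea of disambiguating a mention by comparing its context with documents describing each candidate entity. It assumes a bag-of-words cosine similarity over whitespace-separated tokens; the function names `cosine_similarity` and `link_entity`, the candidate identifiers, and the sample texts are all hypothetical and are not taken from CASIA_EL.

```python
import math
from collections import Counter
from typing import Dict, Optional


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def link_entity(context: str, candidates: Dict[str, str]) -> Optional[str]:
    """Return the candidate whose description is most similar to the context.

    `candidates` maps a hypothetical knowledge-base entity id to a text
    describing that entity. Returns None when no candidate shares any
    token with the context.
    """
    context_vec = Counter(context.lower().split())
    best_id, best_score = None, 0.0
    for entity_id, description in candidates.items():
        score = cosine_similarity(context_vec, Counter(description.lower().split()))
        if score > best_score:
            best_id, best_score = entity_id, score
    return best_id


# Toy data (assumed for illustration): two entities sharing the name "apple".
candidates = {
    "apple_company": "apple technology company iphone mac computers",
    "apple_fruit": "apple fruit tree sweet edible orchard",
}
linked = link_entity("new iphone released by apple today", candidates)
print(linked)
```

For Chinese microblog text, whitespace splitting would have to be replaced by a word segmenter or character n-grams, and the semantic relations retrieved from the knowledge base could be appended to each candidate's description before scoring.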


Similar resources

Presenting a method for extracting structured domain-dependent information from Farsi Web pages

Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...


Simple Yet Effective Method for Entity Linking in Microblog-Genre Text

Semantic analysis of microblog data is a challenging, emerging research area. Unlike news text, microblogs pose several new challenges, due to their short, noisy, contextualized and real-time nature. In this paper, we investigate how to link entities in microblog posts with a knowledge base and adopt a cascade linking approach. In particular, we first use a mention expansion model to identify all po...


Overview of the TAC 2010 Knowledge Base Population Track

In this paper we give an overview of the Knowledge Base Population (KBP) track at TAC 2010. The main goal of KBP is to promote research in discovering facts about entities and expanding a structured knowledge base with this information. A large source collection of newswire and web documents is provided for systems to discover information. Attributes (a.k.a. “slots”) derived from Wikipedia info...


Time-Aware Entity Linking

Entity Linking is the task of automatically identifying entity mentions in a piece of text and linking them to their corresponding entries in a reference knowledge base like Wikipedia. Although there is a plethora of works on entity linking, existing state-of-the-art approaches do not explicitly consider the time aspect and specifically the temporality of an entity’s prior probability (populari...


Computer-Aided Grammar Acquisition in the Chinese Understanding System CUSAGA

CALAS is a subsystem for acquiring semantic grammars to be used in CUSAGA which can understand technical Chinese texts and extract knowledge from them. The semantic grammar is acquired in a semi-automatic way under the guidance of the user. CUSAGA is implemented on UV68000 with about 12000 PASCAL lines. This paper gives a short overview on the architecture and functions of CUSAGA and a more de...




Publication date: 2013